Search CORE

43 research outputs found

Parallel Programming with Global Asynchronous Memory: Models, C++ APIs and Implementations

Author: Drocco Maurizio
Publication venue
Publication date: 01/01/2017
Field of study

In the realm of High Performance Computing (HPC), message passing has been the programming paradigm of choice for over twenty years. The durable MPI (Message Passing Interface) standard, with send/receive communication, broadcast, gather/scatter, and reduction collectives is still used to construct parallel programs where each communication is orchestrated by the developer-based precise knowledge of data distribution and overheads; collective communications simplify the orchestration but might induce excessive synchronization. Early attempts to bring shared-memory programming model—with its programming advantages—to distributed computing, referred as the Distributed Shared Memory (DSM) model, faded away; one of the main issue was to combine performance and programmability with the memory consistency model. The recently proposed Partitioned Global Address Space (PGAS) model is a modern revamp of DSM that exposes data placement to enable optimizations based on locality, but it still addresses (simple) data- parallelism only and it relies on expensive sharing protocols. We advocate an alternative programming model for distributed computing based on a Global Asynchronous Memory (GAM), aiming to avoid coherency and consistency problems rather than solving them. We materialize GAM by designing and implementing a distributed smart pointers library, inspired by C++ smart pointers. In this model, public and pri- vate pointers (resembling C++ shared and unique pointers, respectively) are moved around instead of messages (i.e., data), thus alleviating the user from the burden of minimizing transfers. On top of smart pointers, we propose a high-level C++ template library for writing applications in terms of dataflow-like networks, namely GAM nets, consisting of stateful processors exchanging pointers in fully asynchronous fashion. We demonstrate the validity of the proposed approach, from the expressiveness perspective, by showing how GAM nets can be exploited to implement both standalone applications and higher-level parallel program- ming models, such as data and task parallelism. As for the performance perspective, preliminary experiments show both close-to-ideal scalability and negligible overhead with respect to state-of-the-art benchmark implementations. For instance, the GAM implementation of a high-quality video restoration filter sustains a 100 fps throughput over 70%-noisy high-quality video streams on a 4-node cluster of Graphics Processing Units (GPUs), with minimal programming effort

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Institutional Research Information System University of Turin

A Comparison of Big Data Frameworks on a Layered Dataflow Model

Author: Aldinucci Marco
Drocco Maurizio
Misale Claudia
Tremblay Guy
Publication venue
Publication date: 16/06/2016
Field of study

In the world of Big Data analytics, there is a series of tools aiming at simplifying programming applications to be executed on clusters. Although each tool claims to provide better programming, data and execution models, for which only informal (and often confusing) semantics is generally provided, all share a common underlying model, namely, the Dataflow model. The Dataflow model we propose shows how various tools share the same expressiveness at different levels of abstraction. The contribution of this work is twofold: first, we show that the proposed model is (at least) as general as existing batch and streaming frameworks (e.g., Spark, Flink, Storm), thus making it easier to understand high-level data-processing applications written in such frameworks. Second, we provide a layered model that can represent tools and applications following the Dataflow paradigm and we show how the analyzed tools fit in each level.Comment: 19 pages, 6 figures, 2 tables, In Proc. of the 9th Intl Symposium on High-Level Parallel Programming and Applications (HLPP), July 4-5 2016, Muenster, German

arXiv.org e-Print Archive

Institutional Research Information System University of Turin

Scaling Dense Linear Algebra on Multicore and Beyond: a Survey

Author: Aldinucci Marco
Drocco Maurizio
Viviani Paolo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018
Field of study

Crossref

Institutional Research Information System University of Turin

On Designing Multicore-aware Simulators for Biological Systems

Author: Aldinucci Marco
Coppo Mario
Damiani Ferruccio
Drocco Maurizio
Torquati Massimo
Troina Angelo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 13/10/2010
Field of study

The stochastic simulation of biological systems is an increasingly popular technique in bioinformatics. It often is an enlightening technique, which may however result in being computational expensive. We discuss the main opportunities to speed it up on multi-core platforms, which pose new challenges for parallelisation techniques. These opportunities are developed in two general families of solutions involving both the single simulation and a bulk of independent simulations (either replicas of derived from parameter sweep). Proposed solutions are tested on the parallelisation of the CWC simulator (Calculus of Wrapped Compartments) that is carried out according to proposed solutions by way of the FastFlow programming framework making possible fast development and efficient execution on multi-cores.Comment: 19 pages + cover pag

arXiv.org e-Print Archive

CiteSeerX

Crossref

Archivio della Ricerca - Università di Pisa

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Institutional Research Information System University of Turin

A Comparison of Big Data Frameworks on a Layered Dataflow Model

Author: Aldinucci Marco
Drocco Maurizio
Misale Claudia
Tremblay Guy
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 01/01/2017
Field of study

Institutional Research Information System University of Turin

PiCo: a Novel Approach to Stream Data Analytics

Author: Aldinucci Marco
Drocco Maurizio
Misale Claudia
Tremblay Guy
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018
Field of study

In this paper, we present a new C++ API with a fluent interface called PiCo (Pipeline Composition). PiCo’s programming model aims at making easier the programming of data analytics applications while preserving or enhancing their performance. This is attained through three key design choices: 1) unifying batch and stream data access models, 2) decoupling processing from data layout, and 3) exploiting a stream-oriented, scalable, efficient C++11 runtime system. PiCo proposes a programming model based on pipelines and operators that are polymorphic with respect to data types in the sense that it is possible to re-use the same algorithms and pipelines on different data models (e.g., streams, lists, sets, etc.). Preliminary results show that PiCo can attain better performances in terms of execution times and hugely improve memory utilization when compared to Spark and Flink in both batch and stream processing.Author's copy (postprint) of C. Misale, M. Drocco, G. Tremblay, and M. Aldinucci, "PiCo: a Novel Approach to Stream Data Analytics," in Proc. of Euro-Par Workshops: 1st Intl. Workshop on Autonomic Solutions for Parallel and Distributed Data Stream Processing (Auto-DaSP 2017), Santiago de Compostela, Spain, 2018. doi:10.1007/978-3-319-75178-8_1

ZENODO

Institutional Research Information System University of Turin

Stochastic Calculus of Wrapped Compartments

Author: Alessandra Di Pierro
Angelo Troina
Elena Grassi
Ferruccio Damiani
Gethin Norman
Mario Coppo
Maurizio Drocco
Publication venue: 'Open Publishing Association'
Publication date: 01/01/2010
Field of study

The Calculus of Wrapped Compartments (CWC) is a variant of the Calculus of Looping Sequences (CLS). While keeping the same expressiveness, CWC strongly simplifies the development of automatic tools for the analysis of biological systems. The main simplification consists in the removal of the sequencing operator, thus lightening the formal treatment of the patterns to be matched in a term (whose complexity in CLS is strongly affected by the variables matching in the sequences). We define a stochastic semantics for this new calculus. As an application we model the interaction between macrophages and apoptotic neutrophils and a mechanism of gene regulation in E.Coli

arXiv.org e-Print Archive

CiteSeerX

Crossref

Directory of Open Access Journals

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Open Access Repository

Accelerating spectral graph analysis through wavefronts of linear algebra operations

Author: Aldinucci Marco
Colonnelli Iacopo
Drocco Maurizio
Grangetto Marco
Viviani Paolo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019
Field of study

Institutional Research Information System University of Turin